These are steps I take to analyze the TilS growth curves. Some things in the parse_96wellGrowthCurves.R script are specific to the 48-hour curves I run, but I tried to make the other parts more general. So, you might need to tailor that script to match what the different plate readers give you, if you adjust the run-times.
The basic steps, however, are:
This requires some familiarity with R (such as understanding how to call a function, navigating file structures, and lists), plus a little understanding of ggplot2. I strongly encourage you to work in an R notebook.
A notebook will track exactly what lines of code you ran (and leave out what you did not), as well as document your analysis and save your results together, as if in a lab notebook. (See this resource for more in-depth help with notebooks.) Using the code chunks in a notebook will also help with troubleshooting, as the pipeline will be broken up.
Troubleshooting is a huge part of programming. Don’t be scared of errors or the forums.
For general R help: R for Data Science, Google, Stack Overflow, and any results from STHDA and datanovia are very helpful. Searching for the exact wording of an error, or just for what you’re trying to accomplish in general, can typically generate good results. I also like to use the “Environment” pane in RStudio to see what variables I have in use. Clicking on their names will also let you view them in matrix format, where you can hover over column names for data type info. This is often helpful for debugging.
(Also, I am a bit of a novice to coding, so there are definitely multiple ways of accomplishing similar goals. Some much more elegant than these. But I don’t know them yet.)
These scripts rely on pieces of code that other smart individuals have created. R, like many other programming languages, allows you to modularly add on the code you need.
But first you need to download (install) them, and then load them (kind of like attaching them).
You may have tons of packages installed, but you only need to load the ones that you will be using for a certain analysis. (Keep in mind that you need to load the packages each time you open RStudio, even if you stopped mid-analysis.)
Most packages are stored in an online repository that R can access easily (called CRAN). Typing install.packages( and then starting to type the name of the package may autofill what you want. If it doesn’t, enter the name in quotations. RStudio may ask you to pick a mirror site; typically choosing one geographically close is fastest. (If a package is not available on the CRAN repositories, visit that package’s website/git page. There should be installation instructions.)
These packages are what I run. Some of them are not entirely necessary, but I use them in this tutorial and they make things easier/nicer. I have marked them. In particular, if you are only running the parsing script to format data, you only need the packages marked “Needed for Parsing.” (To not load a package, comment that line of code out with Ctrl+Shift+C when on that line, or while highlighting multiple lines.)
library(rstatix)
library(openxlsx) # Needed for Parsing
library(DescTools) # Needed for calculating AUC
library(multcomp) # Needed for assigning compact letter displays
library(janitor)
library(reshape2)
library(paletteer) # Not necessary
library(scales) # Not entirely necessary
library(colorspace) # Necessary only for pairwise plots
library(emmeans)
library(growthrates)
library(tidyverse) # Needed for Parsing
library(here) # Not necessary, just what I use to make pathing things easier, and only dependent on project sub-folders.
source(here("parse_96wellGrowthCurves.R")) # This is the script I wrote, just make sure to path to wherever you keep it. And Needed for Parsing.
You should see a bunch of red text, some of which will talk about objects being masked from packages. The order of loading packages can matter: the last package loaded that uses a shared object name is the one that gets to use that name by default.
The last line should be a call that tells you the folder where the here package is rooting all of your paths from. Now, whenever you use a line like here("a folder", "a file.pdf"), the function will look for “a file.pdf” in the “a folder” sub-folder of the base directory. (Typically the base directory is automatically set as the project folder that contains the “.Rproj” file.) In these examples, I’ve used the here package to create semi-nonspecific directions to the file (which is useful for when your project folders move around, but the R-notebook and the files stay together. It assumes that the data structure within a project folder stays the same.) The here function also removes system-specific jargon (Windows uses \ instead of / in file paths, unlike other OSs.)
(As a note, the here function name is used by multiple packages, so sometimes you’ll think you are using here from the here package, but you are actually using the function from another, say lubridate. If a line of code that is giving you problems contains a here(), try putting here::here() in its place, which tells R to specifically go to the here package and use that version of the function. If we instead wanted the lubridate version, we would use lubridate::here() to override whatever R is defaulting to.)
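A tiny illustration of the :: syntax, using a base function so nothing extra needs to be installed:

```r
# The :: operator picks a function out of a specific package's namespace,
# sidestepping any masking by later-loaded packages.
m <- stats::median(c(1, 2, 3, 4, 5))  # explicitly the stats version of median()
m
```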
If you make a new R-project folder with your experimental data in, you can copy over the script and notebook into the main project folder, and things should run just fine.
These are packages that are helpful in making great plots, but not necessary for this tutorial (and so have been commented out.)
# library(ggrepel) # Not necessary
# library(ggpubr)
# library(grid) # I think this is unnecessary.
# library(gtable) # I think ggplot already loads this.
# library(ggtext) # A great way to get text in plots to have extra formatting
# library(ggh4x) # Fantastic for extra color scales and nested facet plots
As a general tip: the “Environment” tab in the upper right corner of RStudio can be really nice for quickly seeing what stored objects you have, and what is stored in them, by clicking on the blue arrow (to get a drop-down summary of structure and contents) or their name (to pop open a new tab, where you can see all of the contents and hover over the column names to see data types.) This is great when trying to figure out what exactly a function (or you) did to your data.
These scripts require that you have two things for each plate:
The template spreadsheet must have columns named Well_ID, Strain, Condition, Replicate (within the plate), and Plate (helpful when multiple plates will be run in a series). There are also requirements for what you fill in:
- In the Replicate column, combine biological and technical replicate, i.e. “C-2” would be for technical replicate 2 of biological replicate C of a specific sample.
- Plate entries should be identical the entire way down the plate, and each plate should have its own template. (In theory you could use the same template file, and then change the plate name for each data frame individually.)
- I keep a clean plate_template.xlsx, and then make a new copy to fill in for each plate. (In the process of filling them out, I re-sort the columns to speed up numbering replicates. BUT re-sorting them back into the A1-A12 format is difficult for Excel.)
The specifics of how you word items can help you out later on. For example, compound condition names built with a consistent separator (like a hyphen) can be split into sub-variables later with the separate command.

The plate reader you use will determine which function you will call:

- parse_growth_Spark for the Tecan Spark (coming from Tecan’s Spark Control software)
- parse_growth_Versa_xlsx for the two VersaMax (coming from SoftMax Pro software, and saved as an xlsx)

Both versions of the function expect:
- datafilewithpath: The data file with a path. (e.g. “C:Documents1.xlsx”) See the here() explanation above.
- templatewithpath: The template file with a path.
- output: What format you want the output data in (options are “R”, “prism”, “python”). This mostly affects how it formats the time column. This is useful if you only intend to use this to automate formatting before moving into Prism.
- remove_emptys: Whether or not to remove the media blanks. If set to FALSE, you will then want to modify the format_GC function in the R script itself. See the comments in that section for help. Save the parse_96wellGrowthCurves.R file, and re-run the source(here...) line to load the changes.
- outputfoldername: A name for the folder where it will write a copy of the parsed-out data as a csv file. It will create the folder if it does not exist already. If you don’t want to save an external copy, don’t list this option.
- totalduration: The total run time as a number. The Tecan version must be either 24 or 48, but the VersaMax can handle up to 48 hours.
- intervalinminutes: The time between each read, as a whole number of minutes. This needs to divide an hour evenly (e.g. 15 minutes is good, but 18 minutes is not.)

(If your raw data and/or template is not in a sub-folder named “data”, either delete or change that part of the here() call.)
If you have extra kinetic methods beyond shaking and reading, the Tecan Spark code may need lines 118 and possibly 122 in parse_96wellGrowthCurves.R adjusted to account for where the reads start/end. Just look at the Excel spreadsheet that the plate reader gives you, find the numbers of the rows that the raw data starts on (the numbers, not the column headers), and change the instances of 54 (and 54+readsperday+24 if you are doing 48-hour curves) to those row numbers. Don’t forget to save, and then either click the “Source” button on the top right, or re-run the source(here("parse_96wellGrowthCurves.R")) line in this notebook.
Here I’m loading 5 different plates, all from the same series. (The R and C naming is to help me remember what each experiment was. Name these according to what helps you.)
R1 <- parse_growth_Spark(datawithpath = here("data", "1-10-2021_RKS-swap1.xlsx"),
templatewithpath = here("data", "1-10-2021_plate_template.xlsx"),
output = "R", remove_emptys = TRUE, outputfoldername = "parsed",
totalduration = 48, interval = 10)
## Saving: C:/Users/mjf123/Documents/R/Growth_Curves_in_R/parsed/1-10-2021_RKS-swap1_parsedforR.csv
C2 <- parse_growth_Spark(here("data", "1-17-2021_CSH-swap2.xlsx"),
here("data", "1-17-2021_plate_template.xlsx"),
"R", remove_emptys = TRUE, outputfoldername = "parsed",
totalduration = 48, interval = 10)
## Saving: C:/Users/mjf123/Documents/R/Growth_Curves_in_R/parsed/1-17-2021_CSH-swap2_parsedforR.csv
C3 <- parse_growth_Spark(here("data", "1-20-2021_CSH-swap3.xlsx"),
here("data", "1-20-2021_plate_template.xlsx"),
"R", remove_emptys = TRUE, outputfoldername = "parsed",
totalduration = 48, interval = 10)
## Saving: C:/Users/mjf123/Documents/R/Growth_Curves_in_R/parsed/1-20-2021_CSH-swap3_parsedforR.csv
R2 <- parse_growth_Spark(here("data", "1-22-2021_RKS-swap2.xlsx"),
here("data", "1-22-2021_plate_template.xlsx"),
"R", remove_emptys = TRUE, outputfoldername = "parsed",
totalduration = 48, interval = 10)
## Saving: C:/Users/mjf123/Documents/R/Growth_Curves_in_R/parsed/1-22-2021_RKS-swap2_parsedforR.csv
R3 <- parse_growth_Spark(here("data", "1-24-2021_RKS-swap3.xlsx"),
here("data", "1-24-2021_plate_template.xlsx"),
"R", remove_emptys = TRUE, outputfoldername = "parsed",
totalduration = 48, interval = 10)
## Saving: C:/Users/mjf123/Documents/R/Growth_Curves_in_R/parsed/1-24-2021_RKS-swap3_parsedforR.csv
These functions don’t return a single dataframe (or matrix/spreadsheet), but rather a list of three of them. Within these dataframes, the data has been formatted to give each replicate its own sample_name of the format [strain].[condition].[platename].[replicate number on plate].
To access a single item from a list, you use listname[[itemnumber]] (i.e. the fourth item in a list called mylist would be called as mylist[[4]]).
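For example, with a toy list (the contents here are just illustrative):

```r
mylist <- list("first", "second", "third", "fourth")
mylist[[4]]  # double brackets pull out the item itself
mylist[4]    # single brackets would instead return a one-item sub-list
```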
So, the parse_growth_ functions have returned a list of 3 dataframes, and stored it in the variable written to the left of the arrow. These dataframes are:
names(R1[[1]]) # lists the column names of the first data frame in R1
## [1] "time" "A244T.CSH.Ca_Mg_swap_1.1"
## [3] "A244T.CSH.Ca_Mg_swap_1.2" "A244T.CSH.Ca_Mg_swap_1.3"
## [5] "A244T.RKS.Ca_Mg_swap_1.1" "A244T.RKS.Ca_Mg_swap_1.2"
## [7] "A244T.RKS.Ca_Mg_swap_1.3" "A244T.RKS_mod.Ca_Mg_swap_1.1"
## [9] "A244T.RKS_mod.Ca_Mg_swap_1.2" "A244T.RKS_mod.Ca_Mg_swap_1.3"
## [11] "N274Y.CSH.Ca_Mg_swap_1.1" "N274Y.CSH.Ca_Mg_swap_1.2"
## [13] "N274Y.CSH.Ca_Mg_swap_1.3" "N274Y.RKS.Ca_Mg_swap_1.1"
## [15] "N274Y.RKS.Ca_Mg_swap_1.2" "N274Y.RKS.Ca_Mg_swap_1.3"
## [17] "N274Y.RKS_mod.Ca_Mg_swap_1.1" "N274Y.RKS_mod.Ca_Mg_swap_1.2"
## [19] "N274Y.RKS_mod.Ca_Mg_swap_1.3" "N445K.CSH.Ca_Mg_swap_1.1"
## [21] "N445K.CSH.Ca_Mg_swap_1.2" "N445K.CSH.Ca_Mg_swap_1.3"
## [23] "N445K.RKS.Ca_Mg_swap_1.1" "N445K.RKS.Ca_Mg_swap_1.2"
## [25] "N445K.RKS.Ca_Mg_swap_1.3" "N445K.RKS_mod.Ca_Mg_swap_1.1"
## [27] "N445K.RKS_mod.Ca_Mg_swap_1.2" "N445K.RKS_mod.Ca_Mg_swap_1.3"
## [29] "P421L.CSH.Ca_Mg_swap_1.1" "P421L.CSH.Ca_Mg_swap_1.2"
## [31] "P421L.CSH.Ca_Mg_swap_1.3" "P421L.RKS.Ca_Mg_swap_1.1"
## [33] "P421L.RKS.Ca_Mg_swap_1.2" "P421L.RKS.Ca_Mg_swap_1.3"
## [35] "P421L.RKS_mod.Ca_Mg_swap_1.1" "P421L.RKS_mod.Ca_Mg_swap_1.2"
## [37] "P421L.RKS_mod.Ca_Mg_swap_1.3" "tRNA-Ile2.CSH.Ca_Mg_swap_1.1"
## [39] "tRNA-Ile2.CSH.Ca_Mg_swap_1.2" "tRNA-Ile2.CSH.Ca_Mg_swap_1.3"
## [41] "tRNA-Ile2.RKS.Ca_Mg_swap_1.1" "tRNA-Ile2.RKS.Ca_Mg_swap_1.2"
## [43] "tRNA-Ile2.RKS.Ca_Mg_swap_1.3" "tRNA-Ile2.RKS_mod.Ca_Mg_swap_1.1"
## [45] "tRNA-Ile2.RKS_mod.Ca_Mg_swap_1.2" "tRNA-Ile2.RKS_mod.Ca_Mg_swap_1.3"
## [47] "WT.CSH.Ca_Mg_swap_1.1" "WT.CSH.Ca_Mg_swap_1.2"
## [49] "WT.CSH.Ca_Mg_swap_1.3" "WT.RKS.Ca_Mg_swap_1.1"
## [51] "WT.RKS.Ca_Mg_swap_1.2" "WT.RKS.Ca_Mg_swap_1.3"
## [53] "WT.RKS_mod.Ca_Mg_swap_1.1" "WT.RKS_mod.Ca_Mg_swap_1.2"
## [55] "WT.RKS_mod.Ca_Mg_swap_1.3"
nrow(R1[[1]]) # counts the number of rows in this data frame
## [1] 288
names(R1[[2]])
## [1] "sample_name" "strain" "condition" "plate" "rep"
## [6] "time" "absorbance"
nrow(R1[[2]])
## [1] 15552
names(R1[[3]])
## [1] "sample_name" "strain" "condition" "plate" "rep"
(Note that 1 and 2 are typically interconverted, and 3 can be derived from 1 or 2.)
The plot_from_melted_plate_faceted and plot_from_melted_plate functions are both simple functions, meant to be a fast way to look at each plate. This is when you get to see if there are outliers, etc. (They can take more than one plate’s worth of data in one data frame, but formatting and customizability are limited.) The first returns a single plot faceted by whatever your different conditions are; the second returns one plot per condition. Both require you to give them the data frame (no path needed this time), a (named) color vector for coloring by strain, a title in string format, and the type of scaling you would like on the y-axis (“straight”, “log2”, “log10”, “natural_log”.)
A detour to cover colors…
If I had different strains across different plates and didn’t care that colors matched, I could just create an unnamed color vector.
I’m going to use the paletteer package’s palette constructors. The package makes accessing a lot of the preset palettes easier. Auto-fill is your friend here (start typing without quotation marks). It also has other functions aside from palette constructors to change colors when manually making your plot within the plotting call.
Here I’m going to call six colors from a discrete (that’s what the d is for) palette.
sixcolors.noname <- paletteer_d("nationalparkcolors::Saguaro", n = 6)
show_col(sixcolors.noname)
If I had really specific colors that I knew the hex codes for, I could also enter them as a list of strings into the sixcolors vector.
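A sketch of that (the hex codes here are arbitrary examples, not the palette above):

```r
# Manually specifying six hex codes instead of using a palette constructor
sixcolors <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00")
length(sixcolors)  # one color per strain
```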
I like named color vectors to keep colors consistent across plots. For this, we just need to make a vector of color codes, each entry named to match the levels of whatever variable you’d like to color on. Here we’re matching the strain names.
First, find how many colors you’ll need for each. Here I know that the strains/samples I ran on each plate are all the same, so I’m only testing R1.
unique(R1[[2]]$strain)
## [1] "A244T" "N274Y" "N445K" "P421L" "tRNA-Ile2" "WT"
length(unique(R1[[2]]$strain))
## [1] 6
Specifically, this call looks at all of the values under “strain” in the second data frame in the list, and lists the unique values. The length() function counts them, which is useful if you have a lot of something.
If we had had different strains on each plate, but wanted to keep the colors consistent, we would need to append each list of strains together; saving the combined list as a variable seems handy.
otherstrains <- c("P421L", "P421L", "tRNA-Lys", "PPC", "PPC", "tRNA-Lys", "A244T", "WT")
total_strains <- unique(c(R1[[2]]$strain, otherstrains, R2[[2]]$strain))
total_strains
## [1] "A244T" "N274Y" "N445K" "P421L" "tRNA-Ile2" "WT"
## [7] "tRNA-Lys" "PPC"
length(total_strains)
## [1] 8
Now we can use a palette constructor to generate eight colors, and use names() to set the names of each color.
eightcolors <- paletteer_d("vapoRwave::hotlineBling", 8)
show_col(eightcolors)
If I don’t yet care about the ordering of colors, I can use the previous type of call to make assigning names easy, without typing them all out.
names(eightcolors) <- total_strains
If you wish to start assigning names to specific colors, note the order of the colors, then enter the names of each strain in a list in the matching order.
Note that because of the order of the two lists, the cyan and dark blue color will never be shown in the following graphs, because those strains don’t actually show up on the plates.
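A minimal sketch of that manual pairing (the hex codes and the ordering here are made up for illustration):

```r
# Manually pairing colors with strains: the nth name goes with the nth color.
mycolors <- c("#000000", "#D55E00", "#0072B2")
names(mycolors) <- c("WT", "A244T", "tRNA-Lys")
mycolors["WT"]  # look up a strain's color by its name
```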
These functions are meant to get a quick look at the data. Modifying the plots is easier when you build the plot yourself.
plot_from_melted_plate_faceted(.data = R1[[2]], colors = eightcolors, title = "Modified RKS, plate 1", yscale = "straight")
This can be hard to see with the scaling, so you might want to be able to page through each condition. Note that the condition plotted is appended to the end of your title.
plot_from_melted_plate(R1[[2]], eightcolors, "Modified RKS, plate 1", "straight")
## [[1]]
##
## [[2]]
##
## [[3]]
These sorts of quick plots are good for giving you a general idea of what is going on. They also tell me that I should cut off my data around 34 to 36 hours. You should run through all of your plates to see how each replicate did as well. From these, I already know that CSH-plate1 was no good, so that is why there isn’t a C1.
The melted format (where all plates have the same column names) makes combining data easy. We will simply attach each melted data frame to the bottom of the last. You should have as many entries as you had plates.
allplates <- bind_rows(R1[[2]], R2[[2]], R3[[2]], C2[[2]], C3[[2]])
nrow(allplates)
## [1] 77760
The total number of rows should be the total number of wells you’re considering (not blanks) times the number of time points you have.
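As a quick sanity check (the 54 non-blank wells per plate comes from the earlier nrow calls: 15552 rows / 288 time points):

```r
# 54 wells per plate x 288 time points x 5 plates
54 * 288 * 5
```

This matches the nrow(allplates) output above.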
Again we can also get an idea of what all of the data looks like on the same plot via individual replicates.
plot_from_melted_plate(allplates, sixcolors.noname, "All plates", "straight")
## [[1]]
##
## [[2]]
##
## [[3]]
##
## [[4]]
You can see that my CSH_mod conditions definitely vary by plate, but overall trends are maintained.
In order to start getting things ready to plot, and to analyze, we should remove the baseline from the absorbance readings, and order the strains and conditions in a meaningful way. We can do that with format_GC, which will subtract a baseline to create a corrected-OD column, and turn the strain and condition columns into ordered factors.
To start, let’s see how R ordered my strains and conditions:
unique(allplates$strain)
## [1] "A244T" "N274Y" "N445K" "P421L" "tRNA-Ile2" "WT"
unique(allplates$condition)
## [1] "CSH" "RKS" "RKS_mod" "CSH_mod"
This corresponds to the order in which R read them in, and the order in which each plate’s dataframe was attached. This is fine, but I would like to compare to WT (so I list it first), and the conditions are step-wise variations. So, matching the exact names already being used, enter them in lists in the order you wish them to appear. This can be helpful when you are using numbers that are not actually in “order” but rather are names.
df <- format_GC(allplates,
strainfactors = c("WT", "A244T", "N274Y", "N445K", "P421L", "tRNA-Ile2"),
conditionfactors = c("RKS", "RKS_mod", "CSH_mod", "CSH"))
unique(df$strain)
## [1] A244T N274Y N445K P421L tRNA-Ile2 WT
## Levels: WT < A244T < N274Y < N445K < P421L < tRNA-Ile2
unique(df$condition)
## [1] CSH RKS RKS_mod CSH_mod
## Levels: RKS < RKS_mod < CSH_mod < CSH
If you check out the column names of df, you can see that the function created some columns, and renamed others.
names(df)
## [1] "sample_name" "strain" "condition" "plate" "rep"
## [6] "time" "OD" "OD.corrected"
This function can only handle formatting one condition variable, however if you have more than one…
(To start, I would still run the format_GC function to let it do its other jobs; simply fill in the list of condition factors as before.)
In this specific experiment, I actually have two condition factors (Ion and Buffer recipe/levels) that I’ve currently been treating as one condition factor (and which I will still use as an example later). When it comes time for the stat analysis, I’d like to be able to control and compare using these. As might be obvious from the condition labels, each condition value dictates the values of two other factors I’d like to be able to analyze: Buffer and Ions.
The exact breakdown of my condition into the sub-conditions looks like:
| Condition value | Buffer | Ion formula |
|---|---|---|
| RKS | RKS | RKS |
| RKS_mod | RKS | CSH |
| CSH | CSH | CSH |
| CSH_mod | CSH | RKS |
So now I’d like to extract these two variables from the condition variable.
Depending on how the condition variable is structured, you will have two different approaches to extracting the levels/values for each variable:

- If the condition names are compound (joined with a consistent separator), you can split the sub-variables straight out of condition.
- If the condition names are not easily broken down into sub-variables, you’ll have to assign them case by case.

Because I did not give my conditions compound names, extracting the two different variables from the one condition variable will take a little work. The general flow is: for each new variable, examine what is in the original condition variable at each row, and assign a value on a case-by-case basis. The conditions were named after the buffering components of the media, so I assigned Buffer using a general search for the two different buffers. The Ions required an individual assignment for each possible value of the condition variable.
To add extra variable columns, we’ll use the following functions:
- %>% to continuously modify the data frame df, by passing it through functions that then “return” a modified df to hand off
- mutate to create a new column/variable and assign it values at each row. If you make the new column name (the first part before the equal sign) the same as the old column name, you will just replace/modify the existing column.
- grepl to find the values we want. Because of how I named these conditions, I need to search for any instances of “RKS” or “CSH” in the condition variable, regardless of whether they are by themselves, or part of a larger string. (Note, I could have just repeated what I did for the Ions variable up in the Buffer part, and written out each case explicitly, but I let R do the legwork for me.)
- case_when to conditionally assign values to this new variable based on the values in other columns/variables for that row
  - The left-hand side (before the ~) is the case we’re checking for. This should evaluate to a boolean (TRUE/FALSE).
  - The right-hand side (after the ~) is the value to store in the variable at that row. Here we have character strings.
- factor to turn each variable into a factor, with a specific ordering. (This is how the format_GC function works.)

dfExtended <- df %>% mutate(
Buffer = case_when(grepl("RKS", condition) ~ "RKS",
grepl("CSH", condition) ~ "CSH"),
Ions = case_when(condition == "RKS" ~ "RKS",
condition == "RKS_mod" ~ "CSH",
condition == "CSH" ~ "CSH",
condition == "CSH_mod" ~ "RKS")
) %>%
mutate(
Buffer = factor(Buffer, levels = c("RKS", "CSH"), ordered = TRUE),
Ions = factor(Ions, levels = c("RKS", "CSH"), ordered = TRUE)
)
Compound naming makes this part super easy. The separate function takes a column (here condition), looks for the separator on which to split the values (here a hyphen), and then stores them sequentially in the new columns you give it (the into = parameter with the list of variables). I tell it to keep the original column intact (the remove = parameter). The function expects as many pieces as you give it variables; this example expects two hyphens and three chunks of text for each entry in the condition column. It will warn you if there are too many or too few pieces. If you know that there will be discrepancies, you can dictate the function’s behavior with the extra = and fill = arguments respectively (see the separate documentation for more help.)
(This part of code is hypothetical and should not be run for the tutorial.)
dfExtended <- df %>% separate(condition, into = c("First Variable", "Second Variable", "Third Variable"),
remove = FALSE,
sep = "-") %>%
mutate(
Buffer = factor(Buffer, levels = c("RKS", "CSH"), ordered = TRUE),
Ions = factor(Ions, levels = c("RKS", "CSH"), ordered = TRUE)
)
Because using two condition variables is just an extension of using one, I will proceed through this example with two variables. (Typically, simplifying down to one just means deleting the second variable.)
We can also look at a plot of the means at each time point using mean_OD and bare_meanplot. mean_OD just needs the data frame you wish to calculate the mean ODs for, and how you want to group the data. It will return a data frame with columns for the mean corrected-OD absorbance and error stats.
Most often for general growth curve representation, we want to group by time, strain, and condition. The grouping variables need to be entered in a list with quotes around each. There is a row for each time/strain/condition combination. (Here 6 strains \(\times\) 4 conditions \(\times\) 288 time points.)
df.means <- mean_OD(dfExtended, c("time", "strain", "Buffer", "Ions"))
nrow(df.means)
## [1] 6912
Note that none of the raw data is in the data frame.
To plot this, bare_meanplot needs the mean data and a (named) color list. It creates an extremely basic ggplot to which you can add faceting, scaling, etc., as it can take a lot of fine-tuning to get a final plot you are happy with. I’m going to store it in a variable, so that I can use it as a base later and add other ggplot layers. To view it, just call its name.
means.plot <- bare_meanplot(df.means, eightcolors)
means.plot
This looks terrible! That’s because we have multiple conditions all mixed together, and ggplot is trying to put them in one line.
To break up by conditions, use facet_wrap or facet_grid. Here I’m also using ggsave to save a copy of the plot as a pdf (the function will adjust appropriately if you just change the extension of the file name) with modified dimensions. (Again, to save it in a sub-folder, add another item in quotes set off by a comma in the here call, like here("asubfolder", "testplot.pdf").) (Also, if you have the pdf open and are trying to save over it, RStudio will get mad.)
testplot <- means.plot + facet_wrap(Buffer + Ions ~ .)
testplot
ggsave(here("testplot.pdf"), testplot, height = 8, width = 6)
Note that the colors stayed consistent with my first plots, and the order changed.
To get fancy, I’d like to trim my x-axis and re-label the faceting variable by both the variable and its value. I’m also going to get fancy with saving, by adding the date into the file name (which can be great if you keep remaking plots). Also note that I don’t need to specify the name of the plot object; ggsave will just save the last plot called.
means.plot +
scale_x_continuous(limits = c(10, 40)) +
facet_wrap( Buffer + Ions ~ . , labeller = label_both)
## Warning: Removed 642 row(s) containing missing values (geom_path).
ggsave(here(paste(Sys.Date(), "testplot.png")), height = 8, width = 6)
R will warn you that you are omitting data points.
Returning again to the single variable condition, to display names other than what is in the conditions variable, we’ll use a trick similar to the named color vectors, and make a named list of labels. Because we’ll feed them into ggplot’s labelling functions, you can use special notation (with label_wrap_gen) or plotmath notation (with label_parsed) to get italics, sub/superscripts, and symbols. (Google will be your best friend here. As it always is with all coding.) We will use a much messier version later, too.
The list assigned to the names call must match the values in the condition variable. Again, make sure that the order of the lists match each other (they don’t necessarily have to match the display order).
conds.labels <- c("RKS Buffer\nRKS Ions", "RKS Buffer\nCSH Ions", "CSH Buffer\nRKS Ions", "CSH Buffer\nCSH Ions")
names(conds.labels) <- c("RKS", "RKS_mod", "CSH_mod", "CSH")
OK, for the plot (even though the means are the same as before, the condition variable was not included before, so we’re simply making another (almost identical) dataframe):
df.means.single <- mean_OD(dfExtended, c("time", "strain", "condition"))
means.plot.single <- bare_meanplot(df.means.single, eightcolors)
means.plot.single +
scale_x_continuous(limits = c(10, 40)) +
facet_wrap(condition~., labeller = as_labeller(conds.labels, label_wrap_gen(width = 12)))
## Warning: Removed 642 row(s) containing missing values (geom_path).
Warnings about removed rows stem from the fact that I have essentially censored the data, i.e. ggplot is confused that there are rows with values that are not being displayed.
For more options and help tweaking the plots, I really like STHDA’s tutorials, which pop up in Google search results for “ggplot2 + ______”, like “change axis ticks” or “title”. I tried to keep what I declared minimal, so that ggplot would not get angry about things getting rewritten; but, if it causes problems, you can modify either a copy of the body of the code for bare_meanplot or the script itself.
Just as an example, let’s say I wanted to see the averages of each plate (instead of letting them average together). We’ll use mean_OD again, but with another grouping variable.
df.plateMeans <- mean_OD(dfExtended, c("time", "strain", "Buffer", "Ions", "plate"))
Now we’ll plot in a grid, with the conditions as columns, and plates as rows. (Because the facet labels don’t include values for the plates, I had to modify the labeller call to specifically say, “Only apply label_both to the condition variables.” If I wanted plate labels too, I could either append more values onto the named labels, or create another named list of labels and add a second item in that call: labeller = labeller(Ions = ..., plate = ...).)
bare_meanplot(df.plateMeans, eightcolors) +
scale_x_continuous(limits = c(10, 40)) +
scale_y_continuous(limits = c(0, 1.3)) +
facet_grid(plate ~ Buffer + Ions, labeller = labeller(Ions = label_both, Buffer = label_both))
## Warning: Removed 642 row(s) containing missing values (geom_path).
There are tons of options! Again, check out the STHDA and datanovia pages for any ggplot2 topic.
Looking at curves is great, but now you’d like fitness measurements. So we have to calculate those.
This part relies heavily on the growthrates package, which has a helpful tutorial with explanations of its different functions and the models they use.
All of them will need a data frame with baseline-removed absorbance data, but no negative values.
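If baseline subtraction happens to leave small negative readings, one simple fix (a sketch, not part of the scripts) is to clamp them to zero:

```r
# pmax() compares element-wise, so negatives become 0 and positives pass through
od <- c(-0.002, 0.001, 0.15, 0.80)
pmax(od, 0)
```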
To begin, growthrates gives you two groups of model options to fit, with different pros and cons: the simpler “easy linear” and “smoothing splines” methods, or the more complex parametric non-linear models.
Typically you will calculate one AUC for the whole curve, but in some instances you may only wish to look at part of the growth curve. We are using AUC as a proxy for fitness, so you can decide what makes sense for you. Here we’re only looking at 24 and 36 hours even though I know WT grows up to 48 hours, because fitness at 24 hours is important experimentally for me, and 36 hours gives a fuller look at fitness disparity. Going beyond the time when the mutants hit carrying capacity is somewhat “unfair”: area-under-the-curve gives more weight to strains that reach higher ODs at earlier times (even though, once they hit stationary phase, they are not doing a whole lot.)
These are two functions, run for each well, that calculate the AUC up to different time points.
calc.auc24 <- function(df){
  df24 <- subset(df, time <= 24)
  auc <- AUC(df24$time, df24$OD.corrected)
  return(as.numeric(auc))
}
calc.auc36 <- function(df){
  df36 <- subset(df, time <= 36)
  auc <- AUC(df36$time, df36$OD.corrected)
  return(as.numeric(auc))
}
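AUC() here is assumed to come from a package you've loaded (DescTools provides one, for example). If you want to sanity-check its output, a base-R trapezoid sum does the same job on toy data:

```r
# Hypothetical base-R stand-in for AUC(): trapezoidal area under (time, OD) points
trap_auc <- function(time, od) {
  o <- order(time)
  time <- time[o]; od <- od[o]
  sum(diff(time) * (head(od, -1) + tail(od, -1)) / 2)
}

t    <- 0:4
slow <- c(0, 1, 2, 2, 2)   # reaches carrying capacity later
fast <- c(0, 2, 2, 2, 2)   # reaches it earlier
trap_auc(t, slow)  # 6
trap_auc(t, fast)  # 7
```

This also illustrates the weighting point above: the curve that rises earlier accumulates more area even though both end at the same OD.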
To calculate the AUCs, we will make use of a nesting function: for each unique condition/strain/plate/rep/sample_name combo (aka each well), it will make one row and nest all of the experimental data (time, OD, and OD.corrected) as a table in an entry of the column called data. Then it will use the map() function to apply the AUC functions to the nested table at each row and store the output in auc24 or auc36 (and then force them to be stored as numbers). I also realized that the entries in the rep column weren't stored as numbers, but as characters, so that gets fixed here too.
df.nest <- dfExtended %>%
  dplyr::group_by(condition, strain, plate, rep, sample_name) %>%
  nest(data = c(time, OD, OD.corrected)) %>%
  mutate(auc24 = map(data, calc.auc24),
         auc24 = as.numeric(auc24),
         auc36 = map(data, calc.auc36),
         auc36 = as.numeric(auc36),
         rep = as.numeric(rep)) %>%
  ungroup(condition, strain, plate, rep, sample_name)
Note that these are empirical AUCs: we computed the area under the raw data points, not under a fitted curve. Now that we have the AUCs calculated, on to the other parameters, which we will get from fitted curves.
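If the nest()/map() pattern feels opaque, base R's split() does conceptually the same thing: break the data into one little data frame per well, then apply a function to each piece. A toy sketch (the well names and values here are made up):

```r
df <- data.frame(well = rep(c("A1", "A2"), each = 3),
                 time = rep(0:2, 2),
                 OD   = c(0.10, 0.40, 0.80,   # well A1
                          0.10, 0.20, 0.30))  # well A2

# One data frame per well, like one nested row per group
per_well <- split(df, df$well)
# Apply a summary function to each piece, like map()
sapply(per_well, function(d) max(d$OD))
```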
It's a good idea to have a cutoff time, beyond which you ignore the data when fitting curves. This typically affects ODs after the cultures reach stationary phase, where dips, sudden take-offs, and erratic readings can throw off the fitting, and are not necessarily representative of what the population is doing (more an artifact of growth in a 96-well plate).
For the example purposes, I'm going to set the cutoff time to be 48 hours, because WT is so slow for me that an accurate fit needs all of the data points. If your curves hit stationary phase at extremely different times, it may be worth considering different cutoff times for different conditions/strains. (I could stand to do it for this set.)
dfExtended.trim <- filter(dfExtended, time <= 36)
For these fits, we're not going to use the baseline-corrected values; some of the fitting functions use logarithmic transformations, and zero values cause problems.
As a note, the fitting can take several minutes, so don't be alarmed if RStudio appears to be stuck evaluating. (I try to close open programs before running these.)
These fitting functions use a formula format to figure out what is x and y, and what grouping variables are important. In the following examples OD is the y, time is the x, and everything after the vertical pipe | lists the groupings we care about. Here the combination of strain, plate, replicate, Buffer condition, and Ion condition makes it so that we fit one curve per well/growth curve. If we were to only have strain + Buffer + Ions after the pipe, the fitting functions would "average" all of the replicates across all of the plates, fitting one curve to 9 different OD values at each time point, because we did not specify that the plate and rep variables were important groupings.
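The pipe in that formula is ordinary R formula syntax; growthrates just interprets everything to its right as grouping variables. You can convince yourself that R treats the whole thing as one formula object:

```r
f <- OD ~ time | strain + plate + rep
class(f)      # "formula"
all.vars(f)   # "OD" "time" "strain" "plate" "rep"
```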
The easiest to fit, but least informative, are the "easy linear" and "smoothing splines" fits. They require minimal input and take the shortest amount of time to run (although that can still be several minutes, depending on how many plates' worth of data you are analyzing). The only information you get from these is v-max (the maximum early log-linear growth rate, written as mumax for the Greek letter mu, shorthand for growth rate), lag, and the y-intercept (which we know should be around 0).
The function itself returns the fit and not the parameters or a table, which can take a little getting used to.
rates.easy <- all_easylinear(OD ~ time | strain + plate + rep + Buffer + Ions,
                             data = dfExtended.trim)
You can see each of the fits. This gets its own line because it’s large.
plot(rates.easy)
The coefficients themselves and the r-squared "goodness of fit" estimate can be extracted from the fits with the results() function from the growthrates package.
rates.easy.coef <- results(rates.easy)
Next we fit the full Gompertz growth model to each curve, meaning that we get carrying capacity in addition to v-max and lag. (There are other models that you can fit; see the growthrates documentation of growthmodel for more options.)
The three-parameter Gompertz equation has:
- y0 = initial value
- mumax = maximum growth rate
- K = carrying capacity
- lambda = lag time in hours
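For intuition, here is my reading of the three-parameter Gompertz (the Zwietering log-form, which I believe is what grow_gompertz3 implements; check the growthrates documentation before relying on this exact parameterization). The toy parameter values are made up:

```r
# Assumed Zwietering-style Gompertz: a y0 -> K sigmoid with lag lambda
gompertz3 <- function(time, y0, mumax, K, lambda) {
  A <- log(K / y0)
  y0 * exp(A * exp(-exp((mumax * exp(1) / A) * (lambda - time) + 1)))
}

# Sanity checks: starts near y0, ends near K
gompertz3(0,    y0 = 0.05, mumax = 0.3, K = 1.5, lambda = 10)  # ~ 0.05
gompertz3(1000, y0 = 0.05, mumax = 0.3, K = 1.5, lambda = 10)  # ~ 1.5
```

Plotting this over 0–48 h with your guessed p values is a quick way to see whether the starting estimates are even in the right ballpark.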
Because of how numerical fitting works, we have to give the function starting estimates of what we think each parameter should be, and also constraints on what parameter values would be unreasonable. Pick these numbers based on your original plots of the curves. mumax can be hard to guess, but it's typically between 0.1 and 5.
We are going to "set the seed" so that the pseudo-random aspects of curve fitting are consistent between fittings. (If your starting values are near a bifurcation point (starting at values greater than that point veers one direction, starting below it veers another), the fitting algorithms could produce different fits each time.) If you were to fit again later, setting the same seed guarantees the same results. You can put any number you like here.
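A minimal illustration of what set.seed() buys you (toy numbers, nothing to do with the fits themselves):

```r
set.seed(8)
a <- runif(3)   # three pseudo-random numbers
set.seed(8)
b <- runif(3)   # same seed, same "random" numbers
identical(a, b) # TRUE
```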
(Also note that this example chooses to fit log transformed data, which tends to ensure the log-phase growth is well fit, but not necessarily the stationary phase. Try out transform = "none" to see what looks best.)
## Setting the seed for the pseudo-RNG
set.seed(8)
## initial parameters and box constraints
p <- c(y0 = 0.1, mumax = .2, K = 2, lambda = 10)
lower <- c(y0 = 0.001, mumax = 0.01, K = 1, lambda = 5)
upper <- c(y0 = 0.3, mumax = 2, K = 5, lambda = 24)
## fit growth models to all data using log transformed residuals
many_gompertz1 <- all_growthmodels(
  OD ~ grow_gompertz3(time, parms) | strain + plate + rep + Buffer + Ions,
  data = dfExtended.trim,
  p = p, lower = lower, upper = upper,
  transform = "log", ncores = 2)
Now we're going to re-fit to home in even more, because our starting guesses are often not very good. This is the opportunity to adjust the lag time (lambda) to the ballpark of what you think it should be based on the original plots. You might want to do this if you don't expect lag to be changing very much, so as to re-focus where the parameters should be changing. It will help disentangle the interplay between growth rate (mumax) and lag (lambda), which you can imagine as a line crossing the x-axis at, say, 10. Rotating the line clockwise (making the slope less steep) moves the x-intercept (the lag) away from 10 and towards 0.
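A quick numeric sketch of that rotation (toy numbers): suppose the fit is anchored by the log-OD reached at 15 h; picking a shallower mumax then forces the implied lag toward 0.

```r
# log-OD above baseline at time t, for a line with slope mumax and x-intercept lag
log_od <- function(t, mumax, lag) mumax * (t - lag)

mumax1 <- 0.40; lag1 <- 10
# A shallower slope, constrained to hit the same log-OD at t = 15:
mumax2 <- 0.24
lag2 <- 15 - mumax1 * (15 - lag1) / mumax2

log_od(15, mumax1, lag1)  # 2
log_od(15, mumax2, lag2)  # 2 -- same point on the curve...
lag2                      # ~6.67 -- ...but the lag moved toward 0
```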
## use coefficients of first fit as new initial parameters
pp <- coef(many_gompertz1)
pp[, "lambda"] <- 17 # This line re-aligns the lag to the ballpark you can see it at in the original plots.
## re-fit models
many_gompertz2 <- all_growthmodels(
  OD ~ grow_gompertz3(time, parms) | strain + plate + rep + Buffer + Ions,
  data = dfExtended.trim,
  p = pp, lower = lower, upper = upper,
  transform = "log", ncores = 2)
To store the fits in a matrix/data table format, we extract them with results(). It’s good to take a look at these numbers, in comparison to your limits. The View() function is similar to clicking the variable name in the Environment window. If you see numbers that are bumping up against your limits, go back and try tweaking the initial parameters.
df.params <- results(many_gompertz2)
nrow(df.params)
## [1] 270
# View(df.params)
There are as many rows as there are wells. So this next call is going to make 270 plots if we let it. When you are evaluating your own data, it can be good to look at each of them. Here I only want to give examples, so I subset with [] and a range.
plot(many_gompertz2[10:15])
They look pretty good, but it's also a good idea to peek at the residuals to get an idea of the goodness of the fits. You're looking for residuals that are centered about 0 with no obvious trend.
plot(residuals(many_gompertz2))
There are some not so great residuals in there.
We can look at the worst fits:
resids <- residuals(many_gompertz2)
names(head(resids[order(resids)], 25))
## [1] "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y"
## [3] "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y"
## [5] "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y" "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y"
## [7] "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y" "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y"
## [9] "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y" "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y"
## [11] "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y"
## [13] "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y"
## [15] "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y" "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y"
## [17] "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y"
## [19] "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y" "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y"
## [21] "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:2:RKS:CSH.log_y"
## [23] "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y" "N274Y:Ca_Mg_swap_6:3:RKS:CSH.log_y"
## [25] "P421L:Ca_Mg_swap_6:1:RKS:CSH.log_y"
names(tail(resids[order(resids)], 25))
## [1] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [2] "WT:Ca_Mg_swap_6:3:RKS:CSH.log_y"
## [3] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [4] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [5] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [6] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [7] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [8] "WT:Ca_Mg_swap_1:2:RKS:CSH.log_y"
## [9] "WT:Ca_Mg_swap_6:1:RKS:CSH.log_y"
## [10] "WT:Ca_Mg_swap_6:2:RKS:CSH.log_y"
## [11] "tRNA-Ile2:Ca_Mg_swap_1:1:RKS:CSH.log_y"
## [12] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [13] "WT:Ca_Mg_swap_6:3:RKS:CSH.log_y"
## [14] "WT:Ca_Mg_swap_1:2:RKS:CSH.log_y"
## [15] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [16] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [17] "WT:Ca_Mg_swap_1:2:RKS:CSH.log_y"
## [18] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [19] "WT:Ca_Mg_swap_1:2:RKS:CSH.log_y"
## [20] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [21] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [22] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [23] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [24] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
## [25] "tRNA-Ile2:Ca_Mg_swap_4:3:CSH:RKS.log_y"
These all come from the same conditions (which are hard to decipher from how I named my two condition variables): the CSH Buffer, RKS Ions combo, in which the media crashed. So these make sense to me, and I'm not worried.
The spread of the r-squared values; the closer to 1, the better the fit.
ggplot(df.params, aes(r2)) +
geom_freqpoly(binwidth = 0.0001)
These look great.
Just going to do some tidying of the parameter data frame to match the factorizing from before.
df.params <- df.params %>% mutate(
  strain = factor(strain, levels = c("WT", "A244T", "N274Y", "N445K", "P421L", "tRNA-Ile2"), ordered = TRUE),
  Buffer = factor(Buffer, levels = c("RKS", "CSH"), ordered = TRUE),
  Ions = factor(Ions, levels = c("RKS", "CSH"), ordered = TRUE)
)
This can be a good time to save your data externally. We'd like to combine the AUCs with the extracted parameters for each well, so we'll drop the nested experimental data from the data frame with the AUCs and join it to the parameter data frame, specifying the names of which variables need to line up. (The factorizing we did before is important to get these data frames to join cleanly.)
df.measures <- full_join(
  df.nest %>% select(-data),
  df.params,
  by = c("strain", "plate", "rep", "Buffer", "Ions")
)
write.xlsx(file = here(paste0("calculated_GCmeasures_", Sys.Date(), ".xlsx")), df.measures)
We are going to take advantage of the fact that R is good at vectorizing functions to perform multiple ANOVAs (one for each measurement) at the same time.
(Note that there are more in depth comparisons you can do that consider each variable in different ways.)
There’s a lot of stuff in that measures data frame, but we only want to analyse certain measures. We’re going to prep by taking what we want, shifting to long format from wide, and nesting again.
df.measures.nest <- df.measures %>%
  select(strain, Buffer, Ions, condition, sample_name, auc24, auc36, mumax, K, lambda) %>%
  pivot_longer(cols = c(auc24, auc36, mumax, K, lambda), names_to = "measures", values_to = "values") %>%
  group_by(measures) %>%
  nest() %>%
  ungroup(measures)
Again, we're going to use the map() function to:
- fit an ANOVA for each measure (values ~ strain*Buffer*Ions). Note that I have put the variable I believe has the greatest effect first: strain, then Buffer, and last Ions.
- run pairwise means tests comparing the strains within each combination of Buffer and Ions, using a Sidak method of adjusting. Again, I put the variable I think has the most effect first after the pipe.
- generate compact letter displays (CLDs) from those means tests.
(We are kind of getting ahead of ourselves with the means testing before knowing if there is significance, but we will look at them.) (The emmeans documentation and multiple tutorial pages go more into adjustment methods.)
df.aovs <- df.measures.nest %>% mutate(
  aovs = map(data, ~aov(values ~ strain*Buffer*Ions, data = .x)),
  mt.byConds = map(aovs, ~emmeans(.x, pairwise ~ strain|Buffer*Ions, type = "response", adjust = "sidak")),
  clds.byConds = map(mt.byConds, ~cld(.x[[1]], alpha = 0.05, adjust = "sidak"))
)
Note: trying to view df.aovs almost crashes RStudio.
We can look at the overall results of the ANOVAs. (pwalk() is similar to map(), but it takes a list of vectors and applies a function across them in parallel, similar to performing the function at each row.)
pwalk(df.aovs %>% select(measures, aovs),
      function(measures, aovs){
        cat("\nAnova results for: ", measures, "\n")
        print(summary(aovs))
      })
##
## Anova results for: auc24
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 5 63.45 12.691 127.855 < 2e-16 ***
## Buffer 1 0.01 0.011 0.112 0.73774
## Ions 1 0.14 0.136 1.368 0.24337
## strain:Buffer 5 2.00 0.401 4.036 0.00154 **
## strain:Ions 5 0.34 0.068 0.690 0.63148
## Buffer:Ions 1 2.71 2.711 27.316 3.69e-07 ***
## strain:Buffer:Ions 5 1.36 0.273 2.749 0.01946 *
## Residuals 246 24.42 0.099
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova results for: auc36
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 5 475.2 95.04 110.608 < 2e-16 ***
## Buffer 1 83.0 83.03 96.636 < 2e-16 ***
## Ions 1 125.7 125.71 146.309 < 2e-16 ***
## strain:Buffer 5 33.4 6.69 7.781 8.13e-07 ***
## strain:Ions 5 4.2 0.85 0.989 0.42499
## Buffer:Ions 1 9.0 9.05 10.530 0.00134 **
## strain:Buffer:Ions 5 3.9 0.79 0.916 0.47138
## Residuals 246 211.4 0.86
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova results for: mumax
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 5 0.00507 0.00101 4.275 0.000956 ***
## Buffer 1 0.03976 0.03976 167.754 < 2e-16 ***
## Ions 1 0.25110 0.25110 1059.349 < 2e-16 ***
## strain:Buffer 5 0.00350 0.00070 2.955 0.013079 *
## strain:Ions 5 0.00980 0.00196 8.265 3.09e-07 ***
## Buffer:Ions 1 0.00528 0.00528 22.267 3.98e-06 ***
## strain:Buffer:Ions 5 0.00148 0.00030 1.245 0.288669
## Residuals 246 0.05831 0.00024
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova results for: K
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 5 9.00 1.8008 5.883 3.72e-05 ***
## Buffer 1 0.15 0.1489 0.486 0.486231
## Ions 1 0.64 0.6399 2.090 0.149501
## strain:Buffer 5 9.22 1.8443 6.025 2.79e-05 ***
## strain:Ions 5 6.80 1.3596 4.442 0.000683 ***
## Buffer:Ions 1 0.15 0.1506 0.492 0.483653
## strain:Buffer:Ions 5 3.39 0.6788 2.218 0.053179 .
## Residuals 246 75.30 0.3061
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Anova results for: lambda
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 5 500.7 100.15 146.151 < 2e-16 ***
## Buffer 1 3.4 3.36 4.900 0.02778 *
## Ions 1 132.3 132.30 193.077 < 2e-16 ***
## strain:Buffer 5 27.6 5.51 8.042 4.83e-07 ***
## strain:Ions 5 12.3 2.45 3.581 0.00382 **
## Buffer:Ions 1 41.2 41.20 60.127 2.35e-13 ***
## strain:Buffer:Ions 5 7.4 1.48 2.164 0.05870 .
## Residuals 246 168.6 0.69
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
So we are OK to look at the means testing. There is a lot to glean from these results just by looking, but let's plot.
Note that I did not cover the validity of doing an ANOVA. Functions to test sphericity and distribution can be applied in a similar manner to the other mutate calls, in the form of: mutate( newvariable = map(data, ~testingFunction)).
Labeling is going to require linking what the measurements are called in our data frames (like auc24) to how we would actually like them displayed. That display unfortunately can require some plotmath and expression shenanigans.
Most importantly, there are as many expression-style entries in measure.labels as there are measures that we did calculations for, AND they get assigned names that match the data frame entries.
If you change what times you calculate the AUCs at, make sure to change the labels and the names.
measure.labels <- c("'AUC,'~24~h~(AU%*%h)",
"'AUC,'~36~h~(AU%*%h)", "Lag~(h)",
"Capacity~(AU)",
"v[max]~scriptstyle(bgroup( '(', frac('AU', 'h'), ')' ))")
names(measure.labels) <- c("auc24", "auc36", "lambda", "K", "mumax")
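The key mechanics, in miniature: the labeller looks strings up by name, and label_parsed turns each string into a plotmath expression (roughly what parse() does). The two-entry vector here is a hypothetical stand-in for the real five-entry measure.labels:

```r
labs <- c(auc24 = "'AUC,'~24~h", mumax = "v[max]")

labs[["mumax"]]                      # looked up by the facet value
e <- parse(text = labs[["mumax"]])   # string -> plotmath expression
is.expression(e)                     # TRUE
```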
These next functions are ones I made to try to keep from re-typing the same formatting of factorized variables. They are specific to my strains and entries, but can be modified to match what you have. Make sure the levels = list exactly matches all of the unique entries in the variable you are factorizing. If you'd like to change how they display, you can also add a labels = argument, with a list in the same order as the levels containing what you'd like them replaced with.
# Takes a vector/list
plot_format.strains <- function(vec){
  vec <- factor(vec, levels = c("WT", "A244T", "N274Y", "N445K", "P421L", "tRNA-Ile2"), ordered = TRUE)
  return(vec)
}

# Takes a whole data frame and modifies specific columns
plot_format.factorialconds <- function(df){
  df %>% mutate(
    Buffer = factor(Buffer, levels = c("RKS", "CSH"), ordered = TRUE),
    Ions = factor(Ions, levels = c("RKS", "CSH"), ordered = TRUE),
    strain = factor(strain, levels = c("WT", "A244T", "N274Y", "N445K", "P421L", "tRNA-Ile2"), ordered = TRUE))
}
Colors! I love colors! I’m coloring by strain, so I need 6 colors. And then I name them again.
sixcolor <- paletteer_c("viridis::viridis", 6)
names(sixcolor) <- unique(dfExtended$strain)
"Raw" data points from the calculated values, put into long format, which ggplot2 likes better.
plotdf.data <- df.measures %>% ungroup() %>%
  pivot_longer(cols = c(auc24, auc36, mumax, K, lambda), names_to = "measures", values_to = "values")
When we go to label the plots with the letters, we need to tell ggplot what y-position to put them at.
First, we're going to group our data by the variables that matter for plotting (I want points of the same strain, measure, and condition (which is a Buffer and Ions combo) to plot together). Then we're going to find the maximum value within each group, store it as a number, and factorize the whole data frame. Note that this data frame will have as many rows as we have measure/strain/Buffer/Ions combinations (5 x 6 x 2 x 2 = 120).
df.maxes <- plotdf.data %>%
  group_by(measures, condition, strain, Buffer, Ions) %>%
  summarize(maxes = max(values), .groups = "keep") %>%
  mutate(maxes = as.numeric(maxes)) %>%
  ungroup(c(measures, condition, strain, Buffer, Ions)) %>%
  plot_format.factorialconds
Now we define a function that takes the larger of that maximum and the upper confidence limit and, depending on which measure it's looking at, adds an offset to put the label above everything else. You should adjust the actual numbers to whatever makes sense for your data.
add_label_ypos <- function(df){
  df %>%
    mutate(label.start = case_when(
      maxes > upper.CL ~ maxes,
      TRUE ~ upper.CL
    ),
    y.position = label.start + case_when(
      measures == "auc24" ~ 0.4,
      measures == "auc36" ~ 1.6,
      measures == "mumax" ~ 0.03,
      measures == "lambda" ~ 1,
      measures == "K" ~ 0.2))
}
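If the case_when ladder grows unwieldy, the same per-measure lookup can be done with a named vector indexed by the measures column (the offsets here repeat the ones above so the chunk runs standalone):

```r
offsets <- c(auc24 = 0.4, auc36 = 1.6, mumax = 0.03, lambda = 1, K = 0.2)

# Vectorized lookup: one offset per row's measure
measures <- c("auc24", "K", "mumax")
unname(offsets[measures])  # 0.40 0.20 0.03
```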
Now we actually use this function, the factorizing function, and another function in the parse_96wellGrowthCurves script that reformats the cld output, to create a data frame for plotting the means, confidence intervals, and CLDs. This can be a good data frame to save, too.
df.cld <- df.aovs %>% select(measures, clds.byConds) %>%
  unnest(clds.byConds) %>%
  mutate(.group.letter = lettersub(.group)) %>%
  plot_format.factorialconds

df.cld.plot <- full_join(df.cld, df.maxes,
                         by = c("strain", "measures", "Buffer", "Ions")) %>%
  add_label_ypos
write.xlsx(file = here(paste0("calculated_GCmeasures_CLDs_", Sys.Date(), ".xlsx")), df.cld.plot)
This is a beast, so I tried to include a lot of notes.
Also, there is a jittering function in here to make the data points easier to see. It works off the random seed, however, so if you don't like where your dots are, you can use either of the commented-out set.seed() calls: let the seed do whatever it wants (NULL), or try out different numbers to get different but reproducible results.
# set.seed(NULL) # Let the jitter do whatever it wants
# set.seed(294) # Set the seed to a specific number you can change at will
measureplot.conditions <- ggplot(data = df.cld.plot, aes(x = strain, color = strain)) +
# This is me highlighting a specific plot
geom_rect(data = subset(df.cld.plot, condition == "RKS"),
# fill = "#757575", alpha = 0.2,
fill = NA,
colour = "black",
xmin = -Inf,xmax = Inf,
ymin = -Inf,ymax = Inf) +
# Define the colors
scale_color_manual(values = sixcolor) +
# You may or may not need this line
scale_x_discrete(drop = FALSE) +
# Plots the points
geom_jitter(data = plotdf.data,
mapping = aes(x = strain, y = values, color = strain),
width = 0.25) + # This bit tells it how far apart the points can move
# Add the means, error bars, and CLDs
geom_errorbar(aes(y=emmean, ymin=lower.CL, ymax=upper.CL), color = "white",
size = 0.6, width = 0.3, alpha = 0.9) +
geom_point(aes(y = emmean),size = 1.5, color = "white") +
geom_text(aes(y=y.position, label=.group.letter,
color = strain)) +
scale_y_continuous(position = "left", sec.axis = dup_axis()) + # Duplication to split title from scale
# This defines how to facet, and also tells it how to process the labeling
# for each label I'm faceting by. as_labeller and label_parsed tell it to use a parsing function
# to get the fancy formatting in the list of labels.
# The Buffer and Ions label_both tells it to display the variable name as well as value.
facet_grid(measures~ Ions + Buffer, scales = "free_y", switch = "y",
labeller = labeller(measures = as_labeller(measure.labels, label_parsed),
Buffer = label_both, Ions = label_both)) +
ylab("Measure") +
theme_dark() +
theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
plot.title = element_text(margin = margin(b = 10)),
axis.title.y.left = element_text(margin = margin(r = 8)), # These are all to split the axis title and labels
axis.title.y.right = element_blank(),
axis.text.y.left = element_blank(),
axis.ticks.y.left = element_blank(),
axis.text.x = element_text(angle = 50, hjust = 1),
# axis.ticks.x = element_blank(),
# axis.title = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 12),
plot.caption = element_text(lineheight = 0.5),
# panel.background = element_rect(fill = "grey65"),
# panel.background = element_rect(fill = "grey65", color = "black"),
# strip.background = element_rect(fill = "white", color = "black"),
legend.position = "none") # I have turned off the legend
measureplot.conditions
Look into the ggplot2 tutorials if you wish to change the coloring of the panels and grids, etc. I’ve turned certain things “off” with element_blank() or “none” for the legend.
You can save again:
ggsave(here(paste0("measurePlot_byCondition_", Sys.Date(), ".png")), measureplot.conditions,
width = 8, height = 10)
The CLDs can be nice, but might not show the complete picture.
You can see the pairwise comparisons better with a bit of a heat map.
This next bit manipulates the aovs to extract the pairwise comparisons and tests for significance. I've highlighted what may need to change for your conditions. (I've simplified the two-variable condition back down to a 1-variable condition.)
pairwise.byConds <- df.aovs %>% select(measures, mt.byConds) %>%
  mutate(cons = map(mt.byConds, ~ .$contrasts %>% summary(infer = TRUE) %>% as.data.frame)) %>%
  select(-mt.byConds) %>% unnest(cons) %>%
  separate(contrast, into = c("var1", "var2"), sep = " - ") %>%
  mutate(var2 = gsub("[()]", "", var2),
         signif = case_when(p.value < 0.05 ~ "is",
                            p.value >= 0.05 ~ "not"),
         var1 = plot_format.strains(var1),
         var2 = plot_format.strains(var2),
         `P-value` = format(round(p.value, 3), 3),
         diff = round(estimate, 3),
         condition = interaction(Ions, Buffer) # Might need to change for your conditions
  )
Now, we extract the estimated marginal means as a data frame. (Again simplifying a 2-variable condition into 1.)
df.byConds.means <- df.aovs %>%
  mutate(em = map(mt.byConds, ~ .$emmeans %>% as.data.frame)) %>%
  select(measures, em) %>% unnest(em) %>%
  mutate(em = round(emmean, 3),
         condition = interaction(Ions, Buffer)) # Might need to change for you
For this heat map, it's also good to consider two different coloring schemes. One lets the color extremes be defined by the max/min within a measure; the second defines the extremes within each measure-and-condition combination. Both flip the color direction of the lag variable lambda, because a smaller lag means greater fitness (opposite from the rest).
So we define a function to rescale the color values, and then apply it to differently grouped data frames:
scale_values <- function(x){
  x/max(abs(x))
}
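What scale_values() does, on toy numbers (the helper is repeated here so the chunk runs standalone): the entry with the largest magnitude maps to 1 or -1 and everything else scales proportionally, so each group's colors span the full palette.

```r
scale_values <- function(x){
  x/max(abs(x))
}
scale_values(c(-2, 1, 4))  # -0.50  0.25  1.00
```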
pairwise.byConds <- pairwise.byConds %>%
  group_by(measures) %>%
  mutate(scaled.diff = scale_values(diff)) %>% ungroup(measures) %>%
  mutate(scaled.diff = case_when(measures == "lambda" ~ 0 - scaled.diff,
                                 TRUE ~ scaled.diff)) %>%
  group_by(measures, condition) %>%
  mutate(scaled.diff2 = scale_values(diff)) %>% ungroup(measures, condition) %>%
  mutate(scaled.diff2 = case_when(measures == "lambda" ~ 0 - scaled.diff2,
                                  TRUE ~ scaled.diff2))
Ok, now the actual heat map plot.
First, colors scaled to all measures seen. (You can comment and uncomment the different color scales to get an idea of other options.) Also, palette options for the divergingx call can be: ArmyRose, Earth, Fall, Geyser, TealRose, Temps, Tropic, PuOr, RdBu, RdGy, PiYG, PRGn, BrBG, RdYlBu, RdYlGn, Spectral, Zissou 1, Cividis, Roma.
pairwiseplot.conditions <- ggplot(pairwise.byConds, aes(var2, var1)) +
geom_text(data = df.byConds.means,
aes(x = strain, y = strain, label = em),
size = 2, color = "black", fontface = "bold") +
# I played with different color scales
scale_fill_continuous_divergingx(palette = "Geyser", mid = 0, guide = FALSE) +
# scale_fill_gradient2(high = "blue3", mid = "green", low = "red", midpoint = 0, guide = FALSE) +
geom_tile(data = . %>% filter(signif == "is"), aes(fill = scaled.diff), color = "white") +
geom_text(data = . %>% filter(signif == "not"),
aes(label = diff), color = "grey75", size = 2, na.rm = TRUE) +
geom_text(data = . %>% filter(signif == "is"),
aes(label = diff), color = "white", size = 2, na.rm = TRUE) +
# guides(fill = FALSE, color = FALSE) +
scale_y_discrete(limits = rev) +
ylab("Measure") +
labs(caption = "Means on diagonal. Differences listed above, with strain on the left compared to condition on the bottom.
Tiles with significant differences within a condition have been filled in.
Means testing adjusted by Sidak method, after two-way ANOVA with a cutoff of 0.05.") +
facet_grid(measures ~ Buffer + Ions, labeller = labeller(measures = as_labeller(measure.labels, label_parsed),
.cols = label_both), switch = "y") +
theme_bw() +
theme(axis.text.x = element_text(angle = 50, hjust = 1),
axis.title.y.left = element_text(margin = margin(r = 8)), # These are all to split the axis title and labels
axis.text.y = element_text(margin = margin(l = 4)),
panel.grid = element_blank(),
axis.title.x = element_blank(),
strip.placement = "outside")
pairwiseplot.conditions
Changing just fill = scaled.diff2 in the geom_tile layer will switch to coloring scaled within each condition and measure. This data set does not make the difference extremely obvious, but the top right plot maybe shows it best.
pairwiseplot.conditions.2 <- ggplot(pairwise.byConds, aes(var2, var1)) +
geom_text(data = df.byConds.means,
aes(x = strain, y = strain, label = em),
size = 2, color = "black", fontface = "bold") +
# I played with different color scales
scale_fill_continuous_divergingx(palette = "Geyser", mid = 0, guide = FALSE) +
# scale_fill_gradient2(high = "blue3", mid = "green", low = "red", midpoint = 0, guide = FALSE) +
geom_tile(data = . %>% filter(signif == "is"), aes(fill = scaled.diff2), color = "white") +
geom_text(data = . %>% filter(signif == "not"),
aes(label = diff), color = "grey75", size = 2, na.rm = TRUE) +
geom_text(data = . %>% filter(signif == "is"),
aes(label = diff), color = "white", size = 2, na.rm = TRUE) +
# guides(fill = FALSE, color = FALSE) +
scale_y_discrete(limits = rev) +
ylab("Measure") +
labs(caption = "Means on diagonal. Differences listed above, with strain on the left compared to condition on the bottom.
Tiles with significant differences within a condition have been filled in.
Means testing adjusted by Sidak method, after two-way ANOVA with a cutoff of 0.05.") +
facet_grid(measures ~ Buffer + Ions, labeller = labeller(measures = as_labeller(measure.labels, label_parsed),
.cols = label_both), switch = "y") +
theme_bw() +
theme(axis.text.x = element_text(angle = 50, hjust = 1),
axis.title.y.left = element_text(margin = margin(r = 8)), # These are all to split the axis title and labels
axis.text.y = element_text(margin = margin(l = 4)),
panel.grid = element_blank(),
axis.title.x = element_blank(),
strip.placement = "outside")
pairwiseplot.conditions.2
I prefer NOT to rescale within each condition, so you get a better comparison across all conditions.
ggsave(here(paste0("pairwiseHeatMap_byCondition_", Sys.Date(), ".png")), pairwiseplot.conditions,
width = 12, height = 9)
You can mirror the values/colors the whole way across the plot, which can be nicer to read when trying to interpret your data, but more confusing as well.
pairwiseplot.conditions.mirror <- ggplot(pairwise.byConds, aes(var2, var1)) +
geom_text(data = df.byConds.means,
aes(x = strain, y = strain,label = em),
size = 2, color = "black", fontface = "bold") +
scale_fill_continuous_divergingx(palette = "Geyser", mid = 0, guide = FALSE) +
# scale_fill_gradient2(high = "blue3", mid = "green", low = "red", midpoint = 0) +
geom_tile(data = . %>% filter(signif == "is"), aes(fill = scaled.diff), color = "white") +
geom_text(data = . %>% filter(signif == "not"),
aes(label = diff), color = "grey75", size = 2, na.rm = TRUE) +
geom_text(data = . %>% filter(signif == "is"),
aes(label = diff), color = "white", size = 2, na.rm = TRUE) +
# Mirror
geom_tile(data = . %>% filter(signif == "is"), aes(x = var1, y = var2,
fill = 0 -scaled.diff), color = "white") +
geom_text(data = . %>% filter(signif == "not"),
aes(x = var1, y = var2,
label = 0 - diff), color = "grey75", size = 2, na.rm = TRUE) +
geom_text(data = . %>% filter(signif == "is"),
aes(x = var1, y = var2,
label = 0 - diff), color = "white", size = 2, na.rm = TRUE) +
# guides(fill = FALSE, color = FALSE) +
scale_y_discrete(limits = rev) +
ylab("Measure") +
labs(caption = "Means on diagonal. Differences listed above, with strain on the left compared to condition on the bottom.
Tiles with significant differences within a condition have been filled in.
Means testing adjusted by Sidak method, after two-way ANOVA with a cutoff of 0.05.") +
facet_grid(measures ~ Buffer + Ions, labeller = labeller(measures = as_labeller(measure.labels, label_parsed),
.cols = label_both), switch = "y") +
theme_bw() +
theme(axis.text.x = element_text(angle = 50, hjust = 1),
axis.title.y.left = element_text(margin = margin(r = 8)), # These are all to split the axis title and labels
axis.text.y = element_text(margin = margin(l = 4)),
panel.grid = element_blank(),
axis.title.x = element_blank(),
strip.placement = "outside")
pairwiseplot.conditions.mirror
I like this one the best.
ggsave(here(paste0("pairwiseHeatMap_byCondition_mirrored_", Sys.Date(), ".png")), pairwiseplot.conditions.mirror,
width = 12, height = 9)
Doing this is very similar to what we just did. We don’t need to re-do the ANOVAs, but we do need to calculate new means tests. This formula says "at every level of strain, consider each combination of Ions and Buffer." (Other possible tests are controlling for Buffer and strain, ~ Ions|Buffer*strain, or for Ions and strain, ~ Buffer|Ions*strain.)
The labeling of the plots will take a little more work, too. I haven’t quite figured this part out yet.
So we modify the same ANOVA data frame, adding more means tests and CLDs.
df.aovs <- df.aovs %>% mutate(
mt.byStrains = map(aovs, ~emmeans(.x, pairwise ~ Ions*Buffer|strain, type = "response", adjust = "sidak")), #Within a strain, look at conditions
clds.byStrains = map(mt.byStrains, ~cld(.x[[1]], alpha = 0.05, adjust = "sidak"))
)
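For reference, the other conditionings mentioned above would look like the sketch below. I haven’t run these on this data, and the column names (`mt.byIons`, `mt.byBuffers`) are just made up to follow the same pattern:

```r
# Sketch only: the alternative conditionings, not run here.
df.aovs.alt <- df.aovs %>% mutate(
  # Within each Buffer*strain combination, compare the Ions levels:
  mt.byIons    = map(aovs, ~emmeans(.x, pairwise ~ Ions | Buffer*strain,
                                    type = "response", adjust = "sidak")),
  # Within each Ions*strain combination, compare the Buffer levels:
  mt.byBuffers = map(aovs, ~emmeans(.x, pairwise ~ Buffer | Ions*strain,
                                    type = "response", adjust = "sidak"))
)
```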
Now we only have 4 conditions to color over. I’m going to shortcut making the names by using interaction() to create every possible combination of the vectors I give it (here just two vectors, each two entries long).
fourcolor <- paletteer_d("LaCroixColoR::PassionFruit", 5)[c(1,3,4,5)]
# levels() gives all 4 combinations; interaction() alone only returns the element-wise pairs
names(fourcolor) <- levels(interaction(unique(df.measures$Buffer), unique(df.measures$Ions)))
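As a quick sanity check on interaction() (the factor values below are toys, not real data): called on two vectors it only pastes together the pair in each row, while its levels contain every combination, which is what the color names need to cover.

```r
# Toy two-level factors standing in for Buffer and Ions:
buffers <- factor(c("RKS", "CSH"))
ions    <- factor(c("minus", "plus"))

interaction(buffers, ions)          # element-wise pairs: RKS.minus, CSH.plus
levels(interaction(buffers, ions))  # all 4 combinations
```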
Again, we format the data and add y-positions. When we plot, the interaction term will just combine whatever is in the Buffer column with whatever is in the Ions column for that row.
df.cld.byStrains <- df.aovs %>% select(measures, clds.byStrains) %>%
unnest(clds.byStrains) %>%
mutate(.group.letter = lettersub(.group)) %>%
plot_format.factorialconds
df.cld.byStrain.plot <- full_join(
df.cld.byStrains,
df.maxes,
by = c("strain", "measures", "Buffer", "Ions")) %>%
add_label_ypos
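I’m reusing add_label_ypos() and df.maxes from earlier in the pipeline. If you’re rebuilding this, a minimal stand-in might just nudge each CLD letter a bit above the largest observed value; note that `max.value` here is a placeholder, so use whatever the maximum column in your df.maxes is actually named:

```r
# Hypothetical minimal version of add_label_ypos(); "max.value" is a
# placeholder for the maximum column in df.maxes.
add_label_ypos <- function(df) {
  dplyr::mutate(df, y.position = max.value * 1.08) # ~8% above the top point
}
```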
measureplot.strains <- ggplot(data = df.cld.byStrain.plot, aes(x = interaction(Buffer, Ions),
color = interaction(Buffer, Ions))) +
# Define the colors
scale_color_manual(values = fourcolor, name = "Buffer.Ions") +
# You may or may not need this line
scale_x_discrete(drop = FALSE) +
# Plots the points
geom_jitter(data = plotdf.data,
mapping = aes(x = interaction(Buffer, Ions), y = values, color = interaction(Buffer, Ions)),
width = 0.25) + # This bit tells it how far apart the points can move
# Add the means, error bars, and CLDs
geom_errorbar(aes(y=emmean, ymin=lower.CL, ymax=upper.CL), color = "black",
size = 0.6, width = 0.3, alpha = 0.9) +
geom_point(aes(y = emmean),size = 1.5, color = "black") +
geom_text(aes(y=y.position, label=.group.letter,
color = interaction(Buffer, Ions)), show.legend = FALSE) +
scale_y_continuous(position = "left", sec.axis = dup_axis()) + # Duplication to split title from scale
# This defining how to facet, and also telling it how to process the labeling
# for each label I'm faceting by. as_labeller and label_parsed tell it to use a parsing function
# to get the fancy formatting in the list of labels.
# The Buffer and Ions label_both tells it to display the variable name as well as value.
facet_grid(measures~ strain, scales = "free_y", switch = "y",
labeller = labeller(measures = as_labeller(measure.labels, label_parsed))) +
ylab("Measure") +
theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
plot.title = element_text(margin = margin(b = 10)),
axis.title.y.left = element_text(margin = margin(r = 8)), # These are all to split the axis title and labels
axis.title.y.right = element_blank(),
axis.text.y.left = element_blank(),
axis.ticks.y.left = element_blank(),
axis.text.x = element_text(angle = 50, hjust = 1),
# axis.ticks.x = element_blank(),
# axis.title = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 12),
plot.caption = element_text(lineheight = 0.5),
# panel.background = element_rect(fill = "grey65"),
# panel.background = element_rect(fill = "grey65", color = "black"),
# strip.background = element_rect(fill = "white", color = "black")
)
measureplot.strains
This one is a little harder to read for this data. I included the legend/guide to help with the labeling, but it only helps so much.
Ok, extracting data again. There is more separating this time, because extra variables are in the contrasts.
pairwise.byStrains <- df.aovs %>% select(measures, mt.byStrains) %>%
mutate(cons = map(mt.byStrains, ~ .$contrasts %>% summary(infer = TRUE) %>% as.data.frame)) %>%
select(-mt.byStrains) %>% unnest(cons) %>%
separate(contrast, into = c("var1", "var2"), sep = " - ") %>%
separate(var1, into = c("var1.Ions", "var1.Buffer"), sep = " ") %>%
separate(var2, into = c("var2.Ions", "var2.Buffer"), sep = " ") %>%
mutate(across(c(var1.Buffer, var2.Buffer, var1.Ions, var2.Ions),
~ factor(.x, levels = c("RKS", "CSH"), ordered = TRUE)),
signif = case_when(
p.value < 0.05 ~ "is",
p.value >= 0.05 ~ "not"),
`P-value` = format(round(p.value, 3), nsmall = 3),
diff = round(estimate, 3)
)
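To see what those separate() calls are doing, here is a toy example; the contrast string is made up, but it follows the "Ions Buffer - Ions Buffer" shape this pipeline expects from emmeans:

```r
library(tidyr)

# A made-up contrast label in the shape emmeans produces here:
toy <- data.frame(contrast = "minus RKS - plus CSH")

toy %>%
  separate(contrast, into = c("var1", "var2"), sep = " - ") %>%
  separate(var1, into = c("var1.Ions", "var1.Buffer"), sep = " ") %>%
  separate(var2, into = c("var2.Ions", "var2.Buffer"), sep = " ")
# One row with the Ions and Buffer pieces split into four columns
```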
Oh yeah, we also need the actual means.
df.byStrains.means <- df.aovs %>%
mutate( em = map(mt.byStrains, ~ .$emmeans %>% as.data.frame)) %>%
select(measures, em) %>% unnest(em) %>%
mutate(em = round(emmean, 3),
condition = interaction(Buffer, Ions))
pairwise.byStrains <- pairwise.byStrains %>%
group_by(measures) %>%
mutate(scaled.diff = scale_values(diff)) %>% ungroup(measures) %>%
mutate(scaled.diff = case_when(measures == "lambda" ~ 0 - scaled.diff,
TRUE ~ scaled.diff)) %>%
group_by(measures, strain) %>%
mutate(scaled.diff2 = scale_values(diff)) %>% ungroup(measures, strain) %>%
mutate(scaled.diff2 = case_when(measures == "lambda" ~ 0 - scaled.diff2,
TRUE ~ scaled.diff2))
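I don’t show scale_values() in this section (it was defined earlier in the pipeline). If you’ve lost track of it, a minimal stand-in that preserves sign and rescales to [-1, 1] would be consistent with how it’s used here, since the diverging fill is centered at 0 and the mirroring just negates the values:

```r
# Hypothetical stand-in for the scale_values() helper defined earlier:
# divide by the largest absolute value so results land in [-1, 1],
# with 0 preserved to match the diverging palette's midpoint.
scale_values <- function(x) x / max(abs(x), na.rm = TRUE)
```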
And the plot:
pairwiseplot.strains <- ggplot(pairwise.byStrains, aes(x = interaction(var2.Buffer, var2.Ions),
y = interaction(var1.Buffer, var1.Ions))) +
geom_text(data = df.byStrains.means,
aes(x = interaction(Buffer, Ions), y = interaction(Buffer, Ions), label = em),
size = 2, color = "black", fontface = "bold") +
scale_fill_continuous_divergingx(palette = "Fall", mid = 0, guide = FALSE) +
# scale_fill_gradient2(high = "blue3", mid = "green", low = "red", midpoint = 0) +
#Upper Triangle
geom_tile(data = . %>% filter(signif == "is"),
aes(fill = scaled.diff),
color = "white") +
geom_text(data = . %>% filter(signif == "not"),
aes(label = diff), color = "grey75", size = 2, na.rm = TRUE) +
geom_text(data = . %>% filter(signif == "is"),
aes(label = diff), color = "white", size = 2, na.rm = TRUE) +
# Mirrored lower Triangle
geom_tile(data = . %>% filter(signif == "is"),
aes(x = interaction(var1.Buffer, var1.Ions), y = interaction(var2.Buffer, var2.Ions),
fill = 0 - scaled.diff), color = "white") +
geom_text(data = . %>% filter(signif == "not"),
aes(x = interaction(var1.Buffer, var1.Ions), y = interaction(var2.Buffer, var2.Ions),
label = 0 - diff), color = "grey75", size = 2, na.rm = TRUE) +
geom_text(data = . %>% filter(signif == "is"),
aes(x = interaction(var1.Buffer, var1.Ions), y = interaction(var2.Buffer, var2.Ions),
label = 0 - diff), color = "white", size = 2, na.rm = TRUE) +
# guides(fill = FALSE, color = FALSE) +
scale_y_discrete(limits = rev) +
ylab("Measure") +
xlab("Buffer.Ions") +
labs(caption = "Means on diagonal. Differences listed above, with condition on the left compared to condition on the bottom.
Tiles with significant differences within a condition have been filled in.
Means testing adjusted by Sidak method, after two-way ANOVA with a cutoff of 0.05.") +
facet_grid(measures ~ strain, labeller = labeller(measures = as_labeller(measure.labels, label_parsed)), switch = "y") +
theme_bw() +
theme(axis.text.x = element_text(angle = 90, hjust = 1),
axis.title.y.left = element_text(margin = margin(r = 8)), # These are all to split the axis title and labels
axis.text.y = element_text(margin = margin(l = 4)),
panel.grid = element_blank(),
# axis.title.x = element_blank(),
strip.placement = "outside")
pairwiseplot.strains
sessionInfo()
## R version 4.0.5 (2021-03-31)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17134)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] datasets utils grDevices graphics stats methods base
##
## other attached packages:
## [1] colorspace_2.0-1 here_1.0.1 forcats_0.5.1 stringr_1.4.0
## [5] dplyr_1.0.6 purrr_0.3.4 readr_1.4.0 tidyr_1.1.3
## [9] tibble_3.1.1 ggplot2_3.3.3 tidyverse_1.3.1 growthrates_0.8.2
## [13] deSolve_1.28 lattice_0.20-44 DescTools_0.99.41 multcompView_0.1-8
## [17] multcomp_1.4-17 TH.data_1.0-10 MASS_7.3-54 survival_3.2-11
## [21] mvtnorm_1.1-1 emmeans_1.6.0 RColorBrewer_1.1-2 scales_1.1.1
## [25] paletteer_1.3.0 reshape2_1.4.4 janitor_2.1.0 openxlsx_4.2.3
## [29] rstatix_0.7.0
##
## loaded via a namespace (and not attached):
## [1] minqa_1.2.4 ellipsis_0.3.2 class_7.3-19 rio_0.5.26
## [5] rprojroot_2.0.2 estimability_1.3 snakecase_0.11.0 fs_1.5.0
## [9] gld_2.6.2 rstudioapi_0.13 proxy_0.4-25 farver_2.1.0
## [13] fansi_0.4.2 lubridate_1.7.10 xml2_1.3.2 codetools_0.2-18
## [17] splines_4.0.5 rootSolve_1.8.2.1 knitr_1.33 jsonlite_1.7.2
## [21] broom_0.7.6 dbplyr_2.1.1 compiler_4.0.5 httr_1.4.2
## [25] backports_1.2.1 assertthat_0.2.1 Matrix_1.2-18 cli_2.5.0
## [29] htmltools_0.5.1.1 tools_4.0.5 coda_0.19-4 gtable_0.3.0
## [33] glue_1.4.2 lmom_2.8 Rcpp_1.0.6 carData_3.0-4
## [37] cellranger_1.1.0 vctrs_0.3.8 xfun_0.22 rvest_1.0.0
## [41] lifecycle_1.0.0 zoo_1.8-9 hms_1.1.0 parallel_4.0.5
## [45] sandwich_3.0-1 expm_0.999-6 rematch2_2.1.2 prismatic_1.0.0
## [49] yaml_2.2.1 curl_4.3.1 Exact_2.1 stringi_1.5.3
## [53] highr_0.9 e1071_1.7-6 boot_1.3-28 zip_2.1.1
## [57] FME_1.3.6.1 rlang_0.4.11 pkgconfig_2.0.3 evaluate_0.14
## [61] labeling_0.4.2 tidyselect_1.1.1 plyr_1.8.6 magrittr_2.0.1
## [65] R6_2.5.0 generics_0.1.0 DBI_1.1.1 pillar_1.6.1
## [69] haven_2.4.1 foreign_0.8-81 withr_2.4.2 abind_1.4-5
## [73] modelr_0.1.8 crayon_1.4.1 car_3.0-10 utf8_1.2.1
## [77] rmarkdown_2.8 grid_4.0.5 readxl_1.3.1 minpack.lm_1.2-1
## [81] data.table_1.14.0 reprex_2.0.0 digest_0.6.27 xtable_1.8-4
## [85] munsell_0.5.0 viridisLite_0.4.0